Linear Convergence with Condition Number Independent Access of Full Gradients
Authors
Abstract
For smooth and strongly convex optimization problems, the optimal iteration complexity of gradient-based algorithms is O(√κ log(1/ε)), where κ is the condition number. When the optimization problem is ill-conditioned, we need to evaluate a large number of full gradients, which can be computationally expensive. In this paper, we propose to remove the dependence on the condition number by allowing the algorithm to access stochastic gradients of the objective function. To this end, we present a novel algorithm named Epoch Mixed Gradient Descent (EMGD) that is able to utilize both kinds of gradients. A distinctive step in EMGD is the mixed gradient descent, where we use a combination of the full and stochastic gradients to update the intermediate solution. Theoretical analysis shows that EMGD is able to find an ε-optimal solution by computing O(log(1/ε)) full gradients and O(κ² log(1/ε)) stochastic gradients.
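To make the mixed gradient descent step concrete, below is a minimal Python sketch of an epoch-based scheme in the spirit described above. It assumes a finite-sum objective and an SVRG-style combination, in which the full gradient at the epoch anchor is corrected by the difference of per-sample gradients at the current point and the anchor; the function names (grad_full, grad_i), step size, epoch count, and inner-loop length are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def mixed_gradient_descent(grad_full, grad_i, n, x0,
                           epochs=10, inner_iters=100, eta=0.01, rng=None):
    """Epoch-based mixed gradient sketch (assumed SVRG-style combination).

    grad_full(x)  -- full gradient of the objective at x
    grad_i(x, i)  -- gradient of the i-th component function at x
    n             -- number of component functions
    """
    rng = np.random.default_rng() if rng is None else rng
    x_bar = np.asarray(x0, dtype=float)
    for _ in range(epochs):            # a logarithmic number of epochs, one full gradient each
        mu = grad_full(x_bar)          # full gradient at the epoch anchor
        x = x_bar.copy()
        avg = np.zeros_like(x)
        for _ in range(inner_iters):   # inner loop of cheap stochastic steps
            i = rng.integers(n)        # sample one component function
            # mixed gradient: anchor full gradient plus a stochastic correction
            g = mu + grad_i(x, i) - grad_i(x_bar, i)
            x = x - eta * g
            avg += x
        x_bar = avg / inner_iters      # averaged inner iterates seed the next epoch
    return x_bar

# Toy usage on the least-squares objective f(x) = (1/2n) * ||A x - b||^2.
A, b = np.random.randn(50, 5), np.random.randn(50)
grad_full = lambda x: A.T @ (A @ x - b) / len(b)
grad_i = lambda x, i: A[i] * (A[i] @ x - b[i])
x_hat = mixed_gradient_descent(grad_full, grad_i, len(b), np.zeros(5))
```

Each epoch touches the full data set only once (for the anchor gradient), which is what keeps the number of full-gradient evaluations logarithmic in 1/ε while the stochastic inner loop absorbs the condition-number dependence.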
Similar Articles
Preconditioned Generalized Minimal Residual Method for Solving Fractional Advection-Diffusion Equation
Fractional differential equations (FDEs) have attracted much attention and have been widely used in finance, physics, image processing, biology, and other fields. It is not always possible to find an analytical solution for such equations, so an approximate solution or numerical scheme may be a good approach, particularly schemes from numerical linear algebra for solving ...
Distributed Multiagent Optimization: Linear Convergence Rate of ADMM
We propose a distributed algorithm based on the Alternating Direction Method of Multipliers (ADMM) to minimize the sum of locally known convex functions. This optimization problem captures many applications in distributed machine learning and statistical estimation. We provide a novel analysis showing that if the functions are strongly convex and have Lipschitz gradients, then an ε-optimal solution ca...
DSA: Decentralized Double Stochastic Averaging Gradient Algorithm
This paper considers convex optimization problems where nodes of a network have access to summands of a global objective. Each of these local objectives is further assumed to be an average of a finite set of functions. The motivation for this setup is to solve large-scale machine learning problems where elements of the training set are distributed to multiple computational elements. The decentr...
Stochastic Gradient MCMC with Stale Gradients
Stochastic gradient MCMC (SG-MCMC) has played an important role in large-scale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is becoming increasingly popular to employ distributed systems, where stochastic gradients are computed based on some outdated parameters, yielding what are termed stale gradients. While stale gradients could...
Final Iterations in Interior Point Methods: Preconditioned Conjugate Gradients and Modified Search Directions
In this article we consider modified search directions in the endgame of interior point methods for linear programming. In this stage, the normal equations determining the search directions become ill-conditioned. The modified search directions are computed by solving perturbed systems that can be solved efficiently by the preconditioned conjugate gradient solver. We prove the conv...